Compactly supported radial basis functions: How and why?
Compactly supported basis functions are widely required and used in many applications. We explain why radial basis functions are preferred to multivariate polynomials for scattered data approximation in high-dimensional spaces and give a brief description of how to construct the most commonly used compactly supported radial basis functions: the Wendland functions and the newly found missing Wendland functions. One can construct a compactly supported radial basis function with the required smoothness by following the procedure described here, without sophisticated mathematics. Very short programs and extended tables for compactly supported radial basis functions are supplied.
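The abstract above refers to very short programs for these functions; as a minimal sketch (not the authors' code), the classical Wendland function phi_{3,1}(r) = (1 - r)_+^4 (4r + 1), which is C^2 smooth and positive definite in up to three dimensions, can be evaluated as follows:

```python
import numpy as np

def wendland_c2(r):
    """Wendland function phi_{3,1}(r) = (1 - r)_+^4 (4r + 1):
    compactly supported on [0, 1], C^2 smooth, and positive
    definite in up to three dimensions."""
    r = np.asarray(r, dtype=float)
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

# The function equals 1 at r = 0 and vanishes (together with its
# first two derivatives) at the support boundary r = 1.
print(wendland_c2(0.0))  # 1.0
print(wendland_c2(1.5))  # 0.0
```

The compact support means that, for scattered centers, only nearby points contribute, so the resulting interpolation matrices are sparse.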
Application of Fredholm integral equations inverse theory to the radial basis function approximation problem
This paper reveals and examines the relationship between the solution and stability of Fredholm integral equations and radial basis function approximation or interpolation. The underlying system (kernel) matrices are shown to have a smoothing property which depends on the choice of kernel. Instead of using the condition number to describe the ill-conditioning, and hence looking only at the largest and smallest singular values of the matrix, techniques from inverse theory, particularly the Picard condition, show that understanding the exponential decay of the singular values is critical for interpreting and mitigating instability. Results on the spectra of certain classes of kernel matrices are reviewed, verifying the exponential decay of the singular values. Numerical results illustrating the application of integral equation inverse theory are also provided and demonstrate that interpolation weights may be regarded as samplings of a weighted solution of an integral equation. This is then relevant for mapping from one set of radial basis function centers to another set. Techniques for the solution of integral equations can be further exploited in future studies to find stable solutions and to reduce the impact of errors in the data.
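The exponential singular value decay that the abstract describes is easy to observe numerically. A minimal sketch (assuming, for illustration only, a Gaussian kernel with unit shape parameter on equally spaced points; the paper treats more general kernel classes):

```python
import numpy as np

# Gaussian RBF kernel matrix on n equally spaced centers in [0, 1].
n = 20
x = np.linspace(0.0, 1.0, n)
A = np.exp(-(x[:, None] - x[None, :]) ** 2)  # shape parameter 1

s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
# The singular values decay roughly geometrically.  The condition
# number s[0] / s[-1] collapses this whole spectrum of scales into a
# single ratio, which is why the Picard-condition viewpoint, looking
# at the full decay profile, is more informative.
print(s[0] / s[-1])  # very large condition number
```

Such spectra explain why these interpolation systems are ill-conditioned long before the matrix is formally singular.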
Information Splitting for Big Data Analytics
Many statistical models require the estimation of unknown (co)variance parameter(s). The estimate is usually obtained by maximizing a log-likelihood that involves log-determinant terms. In principle, one requires the \emph{observed information}, the negative Hessian matrix (the second derivative of the log-likelihood), to obtain an accurate maximum likelihood estimator via the Newton method. When one uses the \emph{Fisher information}, the expected value of the observed information, a simpler algorithm than the Newton method is obtained: the Fisher scoring algorithm. With the advance of high-throughput technologies in the biological sciences, recommendation systems, and social networks, the sizes of data sets, and of the corresponding statistical models, have increased by several orders of magnitude. Neither the observed information nor the Fisher information is easy to obtain for these big data sets. This paper introduces an information splitting technique to simplify the computation. After splitting the mean of the observed information and the Fisher information, a simpler approximate Hessian matrix for the log-likelihood can be obtained. This approximate Hessian matrix can significantly reduce computation and makes the linear mixed model applicable to big data sets. Such splitting and the resulting simpler formulas depend heavily on matrix algebra transformations, and are applicable to large-scale breeding models and genome-wide association analysis.

Comment: arXiv admin note: text overlap with arXiv:1605.0764
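The Fisher scoring idea underlying the abstract can be illustrated on the simplest possible variance estimation problem (this toy model is our illustration, not the paper's linear mixed model): for i.i.d. N(0, v) data, the score is U(v) = -n/(2v) + S/(2v^2) with S the sum of squares, and the Fisher information is I(v) = n/(2v^2), so the update v <- v + U(v)/I(v) replaces the Newton step's observed Hessian with its expectation:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=0.0, scale=2.0, size=1000)  # known mean 0, true var 4
n, S = y.size, float(np.sum(y ** 2))

# Fisher scoring for the variance v of i.i.d. N(0, v) data:
#   score             U(v) = -n/(2v) + S/(2v^2)
#   Fisher information I(v) =  n/(2v^2)   (expected negative Hessian)
v = 1.0  # starting value
for _ in range(50):
    score = -n / (2 * v) + S / (2 * v ** 2)
    info = n / (2 * v ** 2)
    v = v + score / info

# For this model the scoring step lands on the MLE S/n in one
# iteration, after which the score is exactly zero.
print(abs(v - S / n) < 1e-8)  # True
```

For genuinely large models the information matrices themselves become the bottleneck, which is the situation the paper's splitting technique addresses.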
Adaptive Softassign via Hadamard-Equipped Sinkhorn
Softassign is a crucial step in several popular algorithms for graph matching and other learning tasks. Such softassign-based algorithms perform very well on small graph matching tasks. However, their performance in large-scale problems is sensitive to a parameter in the softassign step, especially when handling noisy data. Tuning this parameter is difficult and is almost always done empirically. This paper constructs an adaptive softassign method by carefully exploiting Hadamard operations within the Sinkhorn procedure. Compared with previous state-of-the-art algorithms such as scalable Gromov-Wasserstein Learning (S-GWL), the resulting algorithm enjoys both higher accuracy and a significant improvement in efficiency on large graph matching problems. In particular, on the protein network matching benchmark problems (1004 nodes), our algorithm improves on the accuracy of S-GWL while achieving a more than 3x speedup.
- …